high-order interaction model
A Confidence Machine for Sparse High-Order Interaction Model
Das, Diptesh, Ndiaye, Eugene, Takeuchi, Ichiro
In predictive modeling for high-stake decision-making, predictors must be not only accurate but also reliable. Conformal prediction (CP) is a promising approach for obtaining the confidence of prediction results with fewer theoretical assumptions. To obtain the confidence set by so-called full-CP, we need to refit the predictor for all possible values of prediction results, which is only possible for simple predictors. For complex predictors such as random forests (RFs) or neural networks (NNs), split-CP is often employed where the data is split into two parts: one part for fitting and another to compute the confidence set. Unfortunately, because of the reduced sample size, split-CP is inferior to full-CP both in fitting as well as confidence set computation. In this paper, we develop a full-CP of sparse high-order interaction model (SHIM), which is sufficiently flexible as it can take into account high-order interactions among variables. We resolve the computational challenge for full-CP of SHIM by introducing a novel approach called homotopy mining. Through numerical experiments, we demonstrate that SHIM is as accurate as complex predictors such as RF and NN and enjoys the superior statistical power of full-CP.
- Asia > Japan (0.04)
- North America > United States > Oregon (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Asia > Middle East > Jordan (0.04)
Fast and More Powerful Selective Inference for Sparse High-order Interaction Model
Das, Diptesh, Duy, Vo Nguyen Le, Hanada, Hiroyuki, Tsuda, Koji, Takeuchi, Ichiro
Automated high-stake decision-making such as medical diagnosis requires models with high interpretability and reliability. As one of the interpretable and reliable models with good prediction ability, we consider Sparse High-order Interaction Model (SHIM) in this study. However, finding statistically significant high-order interactions is challenging due to the intrinsic high dimensionality of the combinatorial effects. Another problem in data-driven modeling is the effect of "cherry-picking" a.k.a. selection bias. Our main contribution is to extend the recently developed parametric programming approach for selective inference to high-order interaction models. Exhaustive search over the cherry tree (all possible interactions) can be daunting and impractical even for a small-sized problem. We introduced an efficient pruning strategy and demonstrated the computational efficiency and statistical power of the proposed method using both synthetic and real data.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
- North America > United States > Oregon (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Research Report > Experimental Study (0.69)
- Research Report > New Finding (0.48)
- Information Technology > Data Science > Data Mining (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.48)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.48)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)